Locating Data Sources in Large Distributed Systems

Authors

  • Leonidas Galanis
  • Yuan Wang
  • Shawn R. Jeffery
  • David J. DeWitt
Abstract

Querying large numbers of data sources is gaining importance due to increasing numbers of independent data providers. One of the key challenges is executing queries on all relevant information sources in a scalable fashion and retrieving fresh results. The key to scalability is to send queries only to the relevant servers and avoid wasting resources on data sources that will not provide any results. Thus, a catalog service that determines the relevant data sources for a given query is an essential component in efficiently processing queries in a distributed environment. This paper proposes a catalog framework that is distributed across the data sources themselves and does not require any central infrastructure. As new data sources become available, they automatically become part of the catalog service infrastructure, which allows scalability to large numbers of nodes. Furthermore, we propose techniques for workload adaptability. Using simulation and real-world data, we show that our approach is valid and can scale to thousands of data sources.

Similar resources

Optimal Placement of DGs in Distribution System including Different Load Models for Loss Reduction using Genetic Algorithm

Distributed generation (DG) sources are becoming more prominent in distribution systems due to the increasing demand for electrical energy. The locations and capacities of DG sources have a great impact on system losses in a distribution network. This paper presents a study aimed at optimally determining the size and location of distributed generation units in distribution systems with differ...

Data Model and Query Evaluation in Global Information

Global information systems involve a large number of information sources distributed over computer networks. The variety of information sources and the disparity of interfaces make the task of easily locating and efficiently accessing information over the network very cumbersome. We describe an architecture for global information systems that is especially tailored to address the challenges raised i...

A Multiagent-based Framework for Integrating Biological Data

Biological data has been rapidly increasing in volume across different Web data sources. Querying multiple data sources manually on the Internet is time-consuming for biologists. Therefore, systems and tools that facilitate searching multiple biological data sources are needed. Traditional approaches to building distributed or federated systems do not scale well to the large, diverse, and the growing ...

E2DR: Energy Efficient Data Replication in Data Grid

Data grids are an important branch of grid computing that provide mechanisms for the management of large volumes of distributed data. Energy efficiency has recently emerged as a hot topic in large distributed systems. The development of computing systems has traditionally focused on performance improvements driven by the demands of client applications in scientific and business domai...

Publication date: 2003